Cluster-based instance selection for machine classification
Similar resources
FISA: Feature-Based Instance Selection for Imbalanced Text Classification
Support Vector Machines (SVM) classifiers are widely used in text classification tasks, and these tasks often involve imbalanced training. In this paper, we specifically address the cases where negative training documents significantly outnumber the positive ones. A generic algorithm known as FISA (Feature-based Instance Selection Algorithm) is proposed to select only a subset of negative train...
Simple Incremental Instance Selection Wrapper for Classification
Instance selection methods are very useful data mining tools for dealing with large data sets. There exist many instance selection algorithms capable of significantly reducing training data size for a particular classifier without generalization degradation. In contrast to those methods, this paper focuses on general pruning methods which can be successfully applied for arbitrary classificat...
SFLA Based Gene Selection Approach for Improving Cancer Classification Accuracy
In this paper, we propose a new gene selection algorithm based on Shuffled Frog Leaping Algorithm that is called SFLA-FS. The proposed algorithm is used for improving cancer classification accuracy. Most of the biological datasets such as cancer datasets have a large number of genes and few samples. However, most of these genes are not usable in some tasks for example in cancer classification....
An Analysis of Instance Selection Algorithms using Support Vector Machine for Text Classification
Automatic text classification is a popular research topic in text mining, which tries to automatically classify text documents into pre-specified categories. Text mining involves several pre-processing and classification techniques. In this paper, we have analysed several feature selection methods with support ...
ParFDA for Instance Selection for Statistical Machine Translation
We build parallel feature decay algorithms (ParFDA) Moses statistical machine translation (SMT) systems for all language pairs in the translation task at the first conference on statistical machine translation (Bojar et al., 2016a) (WMT16). ParFDA obtains results close to the top constrained phrase-based SMT with an average of 2.52 BLEU points difference using significantly less computation for...
Journal
Journal title: Knowledge and Information Systems
Year: 2011
ISSN: 0219-1377,0219-3116
DOI: 10.1007/s10115-010-0375-z